DETAILED ACTION

This action is in response to an amendment filed on July 12, 2007 for the 
application of Ballew et al., for a "System and method for detecting and managing HPC 
node failure" filed April 15, 2004. 
Claims 1-30 are pending in the application. 

Information disclosed and listed on PTO 1449 filed March 26, 2007, May 18, 2007, and
September 14, 2007 has been considered.

Claims 1-5, 7-15, 17-25, and 27-30 have been amended. 

Claims 1-30 are rejected under 35 USC § 103. 



Specification 

Examiner notes that Applicant is aware of trademarks and has amended the
specification such that the proprietary nature of the marks is respected and has made
every effort to prevent their use in any manner which might adversely affect their
validity as trademarks.

Claim Rejections - 35 USC § 103 

The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148
USPQ 459 (1966), that are applied for establishing a background for determining 
obviousness under 35 U.S.C. 103(a) are summarized as follows: 

1. Determining the scope and contents of the prior art.

2. Ascertaining the differences between the prior art and the claims at issue. 

3. Resolving the level of ordinary skill in the pertinent art. 

4. Considering objective evidence present in the application indicating 
obviousness or nonobviousness. 

Claims 1-30 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Huang (U.S. Patent No. 5,748,882) in view of Karpoff (U.S. PGPub No. 20010049740). 

As per claim 1, Huang discloses a method comprising:
determining that one of a plurality of nodes has failed (col. 5, lines 15-19); and
removing the failed node from a virtual list of nodes, the virtual list comprising
one logical entry for each of the plurality of nodes (col. 10, lines 45-50).

Huang discloses an integrated fabric (col. 4, lines 66-67 through col. 5, lines 1-2). 
Each node contains communication links, communication ports (col. 10, lines 65-67), 
(col. 11, lines 60-61), and the fault tolerance socket (col. 18, lines 28-31). However,
Huang fails to explicitly disclose a switching fabric integrated onto a board and one or 
more processors. 

Karpoff teaches: 

each node comprising a switching fabric integrated onto a board and one or more
processors integrated onto the board (FIG. 4A: a typical INFINIBAND® Architecture 20
includes one or more Central Processing Units (CPUs) 30, a Memory Controller 28, a
Host Interconnect 29, a Host Channel Adapter (HCA) 22, a Target Channel Adapter
(TCA) 24, and one or more Switches 26; page 6, paragraph [0089]). Karpoff further
discloses that nodes attached to the fabric can be assembled into logical subsets or
partitions in order to group hosts or devices with like attributes, much like the zoning
capabilities of Fiber Channel fabrics (page 6, paragraph [0087]).

It would have been obvious to one of ordinary skill in the art at the time of the 
invention to use the method and system for providing multimedia information of Karpoff 
in combination with the method for fault-tolerant computing of Huang to effectively 
monitor a multi-node system. 

One of ordinary skill in the art at the time of the invention would have been 
motivated to make the combination because Huang discloses an integrated fabric (col. 
4, lines 66-67 through col. 5, lines 1-2) wherein each node contains communication 
links, communication ports (col. 10, lines 65-67), (col. 11, lines 60-61). Huang's figure 2 
shows an example of a node, which has at least one processor (col. 4, lines 66-67 
through col. 5, lines 1-2). Karpoff's figure 4A shows a switching fabric, which includes
one or more Central Processing Units (page 6, paragraph [0089]). 
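
For illustration only, the two method steps recited in claim 1 (determining that a node has failed, then removing it from a virtual list holding one logical entry per node) can be sketched as follows. This is a minimal sketch under stated assumptions: the names NodeEntry, VirtualList, and remove_failed are hypothetical and do not appear in Huang, Karpoff, or the application, and the sketch is not the implementation of any of them.

```python
from dataclasses import dataclass


@dataclass
class NodeEntry:
    """One logical entry in the virtual list (field names are hypothetical)."""
    node_id: int
    status: str = "available"


class VirtualList:
    """Holds exactly one logical entry for each node it was built from."""

    def __init__(self, node_ids):
        self.entries = {nid: NodeEntry(nid) for nid in node_ids}

    def remove_failed(self, node_id):
        """Drop the entry of a failed node; return it, or None if absent."""
        return self.entries.pop(node_id, None)


# Four nodes; node 2 is assumed to have been reported as failed.
nodes = VirtualList(range(4))
removed = nodes.remove_failed(2)
remaining = sorted(nodes.entries)
```

Keying the entries by node identifier keeps the one-entry-per-node property explicit and makes removal a single lookup.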

As per claim 2, Huang discloses determining that at least a portion of a job was 
being executed on the failed node (col. 7, lines 49-55) and terminating at least the 
portion of the job (Fig. 5, element 511). 

As per claim 3, Huang discloses determining that the job was associated with a
subset of the plurality of nodes; and deallocating the subset of nodes from the job (col. 
7, lines 5-67 through col. 8, lines 1-9). 

As per claim 4, Huang discloses each entry of the virtual list comprising a node 
status and the method further comprising changing the status of each of the subset of 
nodes to "available" (col. 10, lines 50-56). 

As per claim 5, Huang discloses determining dimensions of the terminated job
based on one or more job parameters and an associated policy; dynamically allocating
a second subset of the plurality of nodes to the terminated job based on the determined
dimensions (col. 17, lines 1-21); and

executing the terminated job on the allocated second subset (col. 7, lines 5-67
through col. 8, lines 1-9).

As per claim 6, Huang discloses the second subset comprising a substantially 
similar set of nodes to the first subset (Fig. 2). 

As per claim 7, Huang discloses that dynamically allocating the second subset
comprises: determining an optimum subset of nodes from a topology of unallocated 
nodes; and allocating the optimum subset (col. 7, lines 5-67 through col. 8, lines 1-9). 

As per claim 8, Huang discloses locating a replacement node for the failed node;
and updating the logical entry of the failed node with information on the replacement 
node (col. 7, lines 5-67 through col. 8, lines 1-9). 

As per claim 9, Huang discloses that determining that one of the plurality of nodes has
failed comprises determining that a repeating communication has not been received 
from the failed node (col. 17, lines 21-30). 

As per claim 10, Huang discloses that determining that one of the plurality of nodes has
failed is accomplished through polling (col. 8, lines 43-63). 
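
For illustration only, the two failure-detection approaches recited in claims 9 and 10 (a missing repeating communication, and polling) can be contrasted in a short sketch. This is a minimal sketch under stated assumptions: the function names, the timeout value, and the probe callback are hypothetical and are not taken from Huang or the application.

```python
def missed_heartbeat(last_seen, now, timeout):
    """Return ids of nodes whose last repeating message is older than timeout."""
    return sorted(nid for nid, t in last_seen.items() if now - t > timeout)


def poll_nodes(node_ids, probe):
    """Actively query each node; a node counts as failed if probe() is falsy."""
    return sorted(nid for nid in node_ids if not probe(nid))


# Passive detection: node 1 last reported 8 seconds ago, beyond the 5 s timeout.
last_seen = {0: 10.0, 1: 4.0, 2: 9.5}
stale = missed_heartbeat(last_seen, now=12.0, timeout=5.0)

# Active detection: the stand-in probe reports node 2 as unreachable.
unreachable = poll_nodes([0, 1, 2], probe=lambda nid: nid != 2)
```

In the first function the monitor waits for messages the nodes send on their own; in the second the monitor initiates each check itself.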

As per claim 11, Huang discloses software encoded in one or more computer- 
readable media and when executed operable to: 

determine that one of a plurality of nodes has failed (col. 5, lines 15-19); and

remove the failed node from a virtual list of nodes, the virtual list comprising one
logical entry for each of the plurality of nodes (col. 10, lines 45-50).

Huang discloses an integrated fabric (col. 4, lines 66-67 through col. 5, lines 1-2). 
Each node contains communication links, communication ports (col. 10, lines 65-67), 
(col. 11, lines 60-61), and the fault tolerance socket (col. 18, lines 28-31). However,
Huang fails to explicitly disclose a switching fabric integrated onto a board and one or 
more processors. 

Karpoff teaches: 

each node comprising a switching fabric integrated onto a board and one or more
processors integrated onto the board (FIG. 4A: a typical INFINIBAND® Architecture 20
includes one or more Central Processing Units (CPUs) 30, a Memory Controller 28, a
Host Interconnect 29, a Host Channel Adapter (HCA) 22, a Target Channel Adapter
(TCA) 24, and one or more Switches 26; page 6, paragraph [0089]). Karpoff further
discloses that nodes attached to the fabric can be assembled into logical subsets or
partitions in order to group hosts or devices with like attributes, much like the zoning
capabilities of Fiber Channel fabrics (page 6, paragraph [0087]).

It would have been obvious to one of ordinary skill in the art at the time of the 
invention to use the method and system for providing multimedia information of Karpoff 
in combination with the method for fault-tolerant computing of Huang to effectively 
monitor a multi-node system. 

One of ordinary skill in the art at the time of the invention would have been 
motivated to make the combination because Huang discloses an integrated fabric (col. 
4, lines 66-67 through col. 5, lines 1-2) wherein each node contains communication 
links, communication ports (col. 10, lines 65-67), (col. 11, lines 60-61). Huang's figure 2 
shows an example of a node, which has at least one processor (col. 4, lines 66-67 
through col. 5, lines 1-2). Karpoff's figure 4A shows a switching fabric, which includes
one or more Central Processing Units (page 6, paragraph [0089]). 

As per claim 12, Huang discloses software operable to determine that at least a portion
of a job was being executed on the failed node (col. 7, lines 49-55) and terminate at
least a portion of the job (Fig. 5, element 511).

As per claim 13, Huang discloses software operable to determine that the job was
associated with a subset of the plurality of nodes; and deallocate the subset of nodes
from the job (col. 7, lines 5-67 through col. 8, lines 1-9).

As per claim 14, Huang discloses each entry of the virtual list comprising a node 
status and the software further operable to change the status of each of the subset of 
nodes to "available" (col. 10, lines 50-56). 

As per claim 15, Huang discloses software operable to determine dimensions of the
terminated job based on one or more job parameters and an associated policy;
dynamically allocate a second subset of the plurality of nodes to the terminated job
based on the determined dimensions (col. 17, lines 1-21); and

execute the terminated job on the allocated second subset (col. 7, lines 5-67
through col. 8, lines 1-9).

As per claim 16, Huang discloses the second subset comprising a substantially 
similar set of nodes to the first subset (Fig. 2). 

As per claim 17, Huang discloses that the software operable to dynamically allocate
the second subset comprises software operable to: determine an optimum subset of 
nodes from a topology of unallocated nodes; and allocate the optimum subset (col. 7, 
lines 5-67 through col. 8, lines 1-9). 

As per claim 18, Huang discloses software operable to locate a replacement node for
the failed node; and update the logical entry of the failed node with information on the
replacement node (col. 7, lines 5-67 through col. 8, lines 1-9).

As per claim 19, Huang discloses that the software operable to determine that one of the
plurality of nodes has failed comprises software operable to determine that a repeating 
communication has not been received from the failed node (col. 17, lines 21-30). 

As per claim 20, Huang discloses that the software determines that one of the
plurality of nodes has failed through polling (col. 8, lines 43-63).

As per claim 21, Huang discloses a system comprising:
a plurality of nodes (Fig. 2); and

a management node (Fig. 2, element 104) operable to:
determine that one of the plurality of nodes has failed (col. 5, lines 15-19); and
remove the failed node from a virtual list of nodes, the virtual list comprising one
logical entry for each of the plurality of nodes (col. 10, lines 45-50).

Huang discloses an integrated fabric (col. 4, lines 66-67 through col. 5, lines 1-2).
Each node contains communication links, communication ports (col. 10, lines 65-67),
(col. 11, lines 60-61), and the fault tolerance socket (col. 18, lines 28-31). However,
Huang fails to explicitly disclose a switching fabric integrated onto a board and one or 
more processors. 

Karpoff teaches: 

each node comprising a switching fabric integrated onto a board and one or more
processors integrated onto the board (FIG. 4A: a typical INFINIBAND® Architecture 20
includes one or more Central Processing Units (CPUs) 30, a Memory Controller 28, a
Host Interconnect 29, a Host Channel Adapter (HCA) 22, a Target Channel Adapter
(TCA) 24, and one or more Switches 26; page 6, paragraph [0089]). Karpoff further
discloses that nodes attached to the fabric can be assembled into logical subsets or
partitions in order to group hosts or devices with like attributes, much like the zoning
capabilities of Fiber Channel fabrics (page 6, paragraph [0087]).

It would have been obvious to one of ordinary skill in the art at the time of the 
invention to use the method and system for providing multimedia information of Karpoff 
in combination with the method for fault-tolerant computing of Huang to effectively 
monitor a multi-node system. 

One of ordinary skill in the art at the time of the invention would have been 
motivated to make the combination because Huang discloses an integrated fabric (col. 
4, lines 66-67 through col. 5, lines 1-2) wherein each node contains communication 
links, communication ports (col. 10, lines 65-67), (col. 11, lines 60-61). Huang's figure 2
shows an example of a node, which has at least one processor (col. 4, lines 66-67
through col. 5, lines 1-2). Karpoff's figure 4A shows a switching fabric, which includes
one or more Central Processing Units (page 6, paragraph [0089]). 

As per claim 22, Huang discloses the management node further operable to:
determine that at least a portion of a job was being executed on the failed node (col. 7,
lines 49-55) and terminate at least a portion of the job (Fig. 5, element 511).

As per claim 23, Huang discloses the management node further operable to: 
determine that the job was associated with a subset of the plurality of nodes; and
deallocate the subset of nodes from the job (col. 7, lines 5-67 through col. 8, lines 1-9). 

As per claim 24, Huang discloses each entry of the virtual list comprising a node 
status and the management node further operable to change the status of each of the 
subset of nodes to "available" (col. 10, lines 50-56). 

As per claim 25, Huang discloses the management node further operable to:
determine dimensions of the terminated job based on one or more job parameters and
an associated policy; dynamically allocate a second subset of the plurality of nodes to
the terminated job based on the determined dimensions (col. 17, lines 1-21); and

execute the terminated job on the allocated second subset (col. 7, lines 5-67
through col. 8, lines 1-9).

As per claim 26, Huang discloses the second subset comprising a substantially 
similar set of nodes to the first subset (Fig. 2). 

As per claim 27, Huang discloses that the management node operable to dynamically
allocate the second subset comprises the management node operable to: determine an 
optimum subset of nodes from a topology of unallocated nodes; and allocate the 
optimum subset (col. 7, lines 5-67 through col. 8, lines 1-9). 

As per claim 28, Huang discloses the management node further operable to: 
locate a replacement node for the failed node; and update the logical entry of the failed 
node with information on the replacement node (col. 7, lines 5-67 through col. 8, lines
1-9).

As per claim 29, Huang discloses that the management node operable to determine
that one of the plurality of nodes has failed comprises the management node operable to
determine that a repeating communication has not been received from the failed node 
(col. 17, lines 21-30). 

As per claim 30, Huang discloses that the management node determines that one of
the plurality of nodes has failed through polling (col. 8, lines 43-63).

Response to Arguments 

Applicant's arguments with respect to claims 1, 11, and 21 have been considered
but are moot in view of the new ground(s) of rejection. 

Conclusion 

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time 
policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the 
shortened statutory period will expire on the date the advisory action is mailed, and any 
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of
the advisory action. In no event, however, will the statutory period for reply expire later 
than SIX MONTHS from the mailing date of this final action. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Elmira Mehrmanesh, whose telephone number is (571)
272-5531. The examiner can normally be reached on 9-5 M-F.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Robert W. Beausoliel, can be reached on (571) 272-3645. The fax phone
number for the organization where this application or proceeding is assigned is
571-273-8300.

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 



